Abstract

Many natural, technological, and social systems incorporate multiway interactions, yet are char- 
acterized and measured on the basis of weighted pairwise interactions. In this article, I propose a 
family of models in which pairwise interactions originate from multiway interactions, by starting 
from ensembles of hypergraphs and applying projections that generate ensembles of weighted pro- 
jected networks. I calculate analytically the statistical properties of weighted projected networks, 
and suggest ways these could be used beyond theoretical studies. Projected weighted networks 
typically exhibit weight disorder along links even for very simple generating hypergraph ensembles. 
Also, as the size of a hypergraph changes, a signature of multiway interaction emerges on pro- 
jected weighted networks that distinguishes them from fundamentally weighted pairwise networks. 
I find the percolation threshold and size of the largest component for hypergraphs of arbitrary 
uniform rank, translate the results into projected networks, and show that the transition is second 
order. This general approach to network formation has the potential to shed new light on our 
understanding of weighted networks. 

PACS numbers: 89.75.Hc, 02.10.Ox, 64.60.ah, 89.65.-s



Electronic address: eduardo.lopez@sbs.ox.ac.uk 



I. INTRODUCTION 



Recent years have seen the growth of complex networks theory, a research area concerned
with the general theory of systems of interacting elements [1]. Its relevance has been
illustrated in a number of problems, such as infectious disease propagation [2], the strength
of social ties [3], data routing in technological networks [4], and motifs in biological
networks [5].
An underlying driver for the growth of this field has been the increased availability of dig- 
itized information, which can be efficiently analyzed to uncover relations between system 
elements. 

A simplifying assumption that is made in networks theory is to characterize interactions as 
being exclusively pairwise (each interaction represented by a link between two nodes), often 
with an associated interaction intensity or weight, generating so-called weighted networks. 
The reason for this approach is that usually the information available for real systems is 
relatively limited. Despite these limitations, weighted networks have proven very useful, 
as a number of measurable network quantities have shown their relevance in application. 
Examples of these quantities are the distribution of node degree (number of links connecting
to a node) [6, 7], optimal path lengths between network nodes [8], and node clustering (a
measure of loops of length three) [9]. Other properties that depend on specific groups of
links (e.g., network communities) have also proven quite useful.

There are situations, however, where it is known that interactions extend to groups 
larger than two (multiway interactions), and one can use such information to create more 
accurate models, avoiding the possibility of oversimplified or misleading results. Examples of
these situations are, for instance, networks of affiliations, where nodes representing
individuals connect to each other by virtue of their membership in a group such as their
family or workplace colleagues; another example is folksonomies [15], systems that encode
information of triplets of the following three ingredients: objects, descriptors of the objects, 
and the individuals making the descriptions. Characterizing these examples by avoiding the 
pairwise simplification should lead to more informative and reliable results. In this article, 
I attempt to provide a method to statistically study systems of multiway interactions and 
relate them with usual network analysis. 

Researchers focusing on problems of multiway interactions have proposed mechanisms 
by which weights are generated as a consequence of these interactions [121 ] . For instance, 
in affiliation networks, when two nodes belong simultaneously to multiple groups, a feature
called co-membership, it is assumed that their relationship intensity is equal to the
number of groups they both belong to. Perhaps the most appealing feature of these ideas 
is that they provide a unifying principle to the structure of some interacting systems: the 
presence of a group generates links, and being part of multiple groups generates weights. 
Surprisingly, these unifying ideas have received limited attention, perhaps because some of 
the mathematical models that are required are less straightforward than typical networks. 
To model multiway interactions, it is appropriate to use hypergraphs, which are
generalizations of networks [16]. They are composed of a set of nodes and a set of hyperedges. Each
hyperedge is a group of interconnected nodes (a clique), and the hypergraph is the collection
of all the hyperedges and isolated nodes; networks are the specialization of hypergraphs in 
which all hyperedges are cliques each with only two nodes. The size of a hyperedge is called 
rank. Hypergraphs are called homogeneous when all hyperedges are equally likely to be 
present, or heterogeneous when each hyperedge has its own (possibly unique) probability to 
appear. For the examples mentioned above: in a folksonomy, for instance, hyperedges are 
all of rank three, whereas in affiliation networks, in principle, hyperedges can have different 
ranks; both examples are likely to be heterogeneous hypergraphs. 

The notion of hypergraphs generating weights is equivalent to constructing networks that 
represent a projection of a hypergraph. In other words, starting from a hypergraph, one 
can create an associated set of links that form a projected weighted network, where each link 
weight is given by the structure of the hypergraph and a projection rule. This construction 
suggests some intriguing possibilities: some data that is typically studied as a network may 
in fact emerge from underlying hypergraphs. If that is the case, it should be possible to 
construct hypergraph models and accompanying projections that can fit observed data and 
narrow down its origins. 

In this article, I study homogeneous and heterogeneous entropy maximizing hypergraph 
ensembles of arbitrary uniform rank r and define general projections of hypergraphs that 
lead to ensembles of projected weighted networks. The properties of different projections 
are explored, relating them to measurable network quantities, and suggesting ways to choose
the appropriate projection. The percolation threshold and size of the largest connected
component of hypergraphs of arbitrary uniform rank are also derived by use of the mapping
between the Potts model and percolation theory [17], and the results are then translated into
the percolation properties of the projected networks. These results show that the transition
is of second order. I find that, as a function of size, there are measurable quantities on 
projected network ensembles of hypergraphs that represent signatures of the presence of 
hidden multiway relations: when faced with a weighted network, these signatures could 
provide indications that there is an associated hypergraph hidden in the data. 

The article is structured in the following way: I first focus on the general definitions of 
projections of hypergraphs onto networks, and on models of entropy maximizing ensembles 
of hypergraphs. With these results, I then study in greater detail the statistical properties of 
general projected networks, as well as some concrete examples of projection that are likely 
to occur in empirical and theoretical studies of this problem. These results suggest how to 
explore network data for possible signatures of multiway relations. Completing the results, I 
focus on the percolation properties of hypergraphs and their projected networks, and explore 
the general notion of sparsity. I close the article with some discussion and conclusions.

II. MAXIMUM ENTROPY HYPERGRAPHS AND THE NETWORK PROJECTION

Consider a hypergraph, represented by $\sigma$, consisting of a set of nodes $1, \dots, N$, and for
each possible hyperedge of $r$ nodes $i_1, \dots, i_r$, an indicator $\sigma_{i_1,\dots,i_r}$ equal to 1 if the
hyperedge is present and 0 if it is absent; all subindices $i_1, \dots, i_r$ take non-repeated values
from the set $\{1, \dots, N\}$. In general, a hypergraph does not require $r$ to be the same for all
hyperedges. However, for the sake of simplicity, I focus on single rank (all hyperedges have
the same $r$) undirected hypergraphs, with the indicator $\sigma_{i_1,\dots,i_r}$ symmetric under permutations
of $i_1, \dots, i_r$ (if one is interested in studying combinations of rank, one merely requires the
introduction of the proper parameters for this, but the qualitative nature of the problem is
the same as that studied here).

The general hypergraph projection onto a network is defined as a function $\mathcal{P}$ applied over
hyperedges of $\sigma$ that produces the adjacency matrix $w_{ij}$ for the projected weighted graph
$G$. Network $G$ is formed by the same node set as $\sigma$, and its adjacency matrix is $w_{ij}$. If a
node does not belong to any hyperedge, it is isolated in both $\sigma$ and $G$. For given $\sigma$, one can
define the subset
$O_{ij}(\sigma) := \{(i_1,\dots,i_r) \mid (i_1,\dots,i_r) \in \sigma \wedge i \in \{i_1,\dots,i_r\} \wedge j \in \{i_1,\dots,i_r\}\}$
of its hyperedges that include simultaneously nodes $i$ and $j$. The kinds of projections studied
here are of the type

\[
  w_{ij}(\sigma) = \mathcal{P}(|O_{ij}(\sigma)|), \tag{1}
\]

where $o_{ij} = |O_{ij}(\sigma)|$ is the size (cardinality) of $O_{ij}(\sigma)$. Thus, the weight of link $ij$ in $G$
only depends on the number of hyperedges that contain $i$ and $j$, an intuitive choice, although
certainly not the only possible model.

In a concrete empirical case, the projection $\mathcal{P}$ should reflect the understanding of the
relation between $\sigma$ and $G$. Here, I present results for some reasonable sample choices of
$\mathcal{P}(O_{ij})$, namely $\mathcal{P}_a(O_{ij}) = o_{ij}$ (additive projection) and $\mathcal{P}_n(O_{ij}) = \theta(o_{ij})$ (nominal
projection), where $\theta$ is the Heaviside step function (= 0 if the argument is 0 or less, and 1
otherwise). In addition, I show some features satisfied by the projected networks generated
by a large class of projections with the general form of Eq. (1). To perform calculations,
note that the additive projection can be written in terms of $\sigma_{i_1,\dots,i_r}$ as

\[
  \mathcal{P}_a(O_{ij}(\sigma)) = o_{ij} = \sum_{(i_1,\dots,i_r) \in O_{ij}(\sigma)} \sigma_{i_1,\dots,i_r}, \tag{2}
\]

whereas the nominal projection is represented by

\[
  \mathcal{P}_n(O_{ij}) = \theta\Big( \sum_{(i_1,\dots,i_r) \in O_{ij}(\sigma)} \sigma_{i_1,\dots,i_r} \Big). \tag{3}
\]

An illustration of $\mathcal{P}_a$ for the case of $r = 3$ is shown in Fig. 1.
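The two sample projections can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: a hypergraph is represented as a plain list of node tuples, the helper names (`pair_counts`, `project`, `additive`, `nominal`) are hypothetical, and pairs that share no hyperedge are simply left out of the result, which presumes a projection with zero weight at $o_{ij} = 0$.

```python
from collections import Counter
from itertools import combinations


def pair_counts(hyperedges):
    """Return o_ij, the number of hyperedges containing both i and j.

    Pairs that never co-occur are not stored (their count is zero).
    """
    counts = Counter()
    for edge in hyperedges:
        # every unordered pair inside a hyperedge gains one co-occurrence
        for i, j in combinations(sorted(set(edge)), 2):
            counts[(i, j)] += 1
    return counts


def project(hyperedges, rule):
    """Apply a projection rule w_ij = rule(o_ij) to every co-occurring pair."""
    return {pair: rule(o) for pair, o in pair_counts(hyperedges).items()}


def additive(o):
    """Additive projection: the weight equals the number of shared hyperedges."""
    return o


def nominal(o):
    """Nominal projection: Heaviside step of the number of shared hyperedges."""
    return 1 if o > 0 else 0


# Illustrative rank-3 hypergraph on four nodes (hypothetical data).
edges = [(1, 2, 3), (1, 2, 4), (2, 3, 4)]
w_add = project(edges, additive)
w_nom = project(edges, nominal)
```

In this toy case nodes 1 and 2 share two hyperedges, so the additive weight of link (1, 2) is 2 while its nominal weight is 1.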

In the literature, both hypergraphs and projections have been used to study interaction
data qualitatively embedded in complex networks theory, but without a sense of unification.
For instance, the choice $\mathcal{P}_n(O_{ij})$ is implicit in work such as [18]; there, if $\sigma_{i_1,\dots,i_r}$ is
interpreted as a specific motif (structural pattern), the model generates unweighted networks
guaranteed to possess those motifs. In another approach, found in Refs. [19, 20], each
hypergraph (containing $r = 2$ and 3 only) treats each rank separately in that the interactions
of nodes by way of pairs are counted independently of the triplet interactions, with no notion
of projection onto a simple graph. References 151 ] do consider projections in some form,
but are limited by the rank $r$ of the hypergraph and by the nature of the projection. Projection
$\mathcal{P}_a$ is in fact common [12], and it is often used as a way to characterize the one-mode
networks that emerge from bipartite graphs [21]. Equation (1) offers a different way to
relate graphs and hypergraphs, which can be applied in the previous models to develop
additional understanding of the problems.



To build unbiased statistical models, I adapt the canonical ensemble approach developed
in Ref. [22] to hypergraphs. The set of all possible hypergraphs $\sigma$ is given by $\{\sigma\}_{\rm conf}$ (the
ensemble), or in other words, $\{\sigma\}_{\rm conf}$ is the union of all possible unique hypergraphs $\sigma$. To
analytically formulate the ensemble problem, consider the entropy $S$, defined as



\[
  S = - \sum_{\{\sigma\}_{\rm conf}} P(\sigma) \ln P(\sigma), \tag{4}
\]



where $P(\sigma)$ represents the probability of a given configuration within the hypergraph
ensemble, and the sum over configurations is equivalent to summing over all hyperedge
combinations, or
$\sum_{\{\sigma\}_{\rm conf}} \to \sum_{\sigma_{1,\dots,r}=0}^{1} \cdots \sum_{\sigma_{N-r+1,\dots,N}=0}^{1}$.
The canonical ensemble approach finds the distribution $P(\sigma)$ that maximizes $S$ while
satisfying conditions that define the ensemble of interest. Such conditions, say
$\{\langle X_a \rangle\}$, with $a$ an enumeration index, are taken to be of the form

\[
  \sum_{\{\sigma\}_{\rm conf}} X_a(\sigma) P(\sigma) = \langle X_a \rangle. \tag{5}
\]

Finally, since $P(\sigma)$ are probabilities, one must guarantee normalization, which translates
into

\[
  \sum_{\{\sigma\}_{\rm conf}} P(\sigma) = 1. \tag{6}
\]

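The solution to this problem ($P(\sigma)$ satisfying the conditions above) is obtained via Lagrange
multipliers. Each condition is related to a multiplier, and one solves the equations

\[
  \frac{\partial}{\partial P(\sigma)} \Big[ S + \eta \Big( 1 - \sum_{\{\sigma\}_{\rm conf}} P(\sigma) \Big)
  + \sum_a \beta_a \Big( \langle X_a \rangle - \sum_{\{\sigma\}_{\rm conf}} X_a(\sigma) P(\sigma) \Big) \Big] = 0, \tag{7}
\]

for $P(\sigma)$, with $\eta, \beta_1, \dots$ the Lagrange multipliers. The solution to the problem can be
expressed as

\[
  P(\sigma) = \frac{e^{-H(\sigma)}}{Z}. \tag{8}
\]

The partition function $Z$, and $H(\sigma)$ (defined as the Hamiltonian), are respectively given by

\[
  Z = \sum_{\{\sigma\}_{\rm conf}} e^{-H(\sigma)} \tag{9}
\]

and

\[
  H(\sigma) = \sum_a \beta_a X_a(\sigma). \tag{10}
\]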
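The step from the stationarity condition of Eq. (7) to the Boltzmann-like form of the solution can be spelled out. The lines below are a sketch of the standard maximum entropy calculation rather than a quotation of the original derivation; they assume only the entropy of Eq. (4) and constraints of the form of Eqs. (5) and (6).

```latex
% Differentiating the bracket of Eq. (7) with respect to a single P(\sigma):
\begin{align*}
0 &= -\ln P(\sigma) - 1 - \eta - \sum_a \beta_a X_a(\sigma), \\
P(\sigma) &= e^{-(1+\eta)} \, e^{-\sum_a \beta_a X_a(\sigma)}, \\
1 &= \sum_{\{\sigma\}_{\rm conf}} P(\sigma)
  \;\Longrightarrow\;
  e^{1+\eta} = \sum_{\{\sigma\}_{\rm conf}} e^{-\sum_a \beta_a X_a(\sigma)} .
\end{align*}
```

Identifying the exponent with $-H(\sigma)$ and $e^{1+\eta}$ with $Z$ gives the usual exponential form, with the multipliers $\beta_a$ fixed afterwards by the constraints.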
Among the simplest non-trivial problems one can address is that of the fully random
hypergraph with equal probability for any hyperedge to exist. The constraint associated
with this example is the requirement that there is a given average number of hyperedges,
$\langle L_r \rangle$, over the hypergraph ensemble. Since $L_r$ for a given configuration $\sigma$ is given by
$L_r(\sigma) = \sum_{(i_1,\dots,i_r) \in \sigma} \sigma_{i_1,\dots,i_r}$, the set of constraints reduces to two Lagrange
multipliers, one for the normalization, and another parameter, labelled $\beta$, for $\langle L_r \rangle$.
Introducing this in Eq. (7) generates the Hamiltonian

\[
  H(\sigma) = \beta \sum_{(i_1,\dots,i_r) \in \sigma} \sigma_{i_1,\dots,i_r} = \beta L_r(\sigma) \tag{11}
\]

and the partition function

\[
\begin{aligned}
  Z &= \sum_{\{\sigma\}_{\rm conf}} e^{-\beta \sum_{(i_1,\dots,i_r) \in \sigma} \sigma_{i_1,\dots,i_r}} \\
    &= \sum_{\sigma_{1,\dots,r}=0}^{1} \cdots \sum_{\sigma_{N-r+1,\dots,N}=0}^{1} \;
       \prod_{(i_1,\dots,i_r) \in T(N,r)} e^{-\beta \sigma_{i_1,\dots,i_r}} \\
    &= \sum_{\sigma_{1,\dots,r}=0}^{1} e^{-\beta \sigma_{1,\dots,r}} \cdots
       \sum_{\sigma_{N-r+1,\dots,N}=0}^{1} e^{-\beta \sigma_{N-r+1,\dots,N}}
     = (1 + e^{-\beta})^{\binom{N}{r}},
\end{aligned}
\tag{12}
\]

where $T(N, r)$ is the set of all possible hyperedges $\{(1, \dots, r), \dots, (N-r+1, \dots, N)\}$, i.e.,
the complete hypergraph of single rank $r$ and size $N$. The last equality can also be obtained
from the symmetry of the Hamiltonian over exchange of indices among $i_1, \dots, i_r$. The result
expresses that there are $\binom{N}{r}$ possible hyperedges $(i_1, \dots, i_r)$ among the $N$ nodes. Using this
result one can show that the $\langle L_r \rangle$ constraint is satisfied for the proper choice of $\beta$, as seen
from averaging $L_r(\sigma)$ in the $P(\sigma)$ ensemble

\[
  \langle L_r \rangle = \sum_{\{\sigma\}_{\rm conf}} \Big[ \sum_{(i_1,\dots,i_r) \in \sigma} \sigma_{i_1,\dots,i_r} \Big] P(\sigma)
  = \sum_{L_r=0}^{\binom{N}{r}} L_r \binom{\binom{N}{r}}{L_r} \frac{e^{-\beta L_r}}{Z}
  = \binom{N}{r} p, \tag{13}
\]

and $p = (1 + e^{\beta})^{-1}$ is the probability for a hyperedge to be present, which is evident from
writing $\langle L_r \rangle / \binom{N}{r} = (1 + e^{\beta})^{-1} = p$; $p$ also corresponds to the expectation value of any
hyperedge, $\langle \sigma_{i_1,\dots,i_r} \rangle = \sum_{\{\sigma\}_{\rm conf}} \sigma_{i_1,\dots,i_r} P(\sigma) = (1 + e^{\beta})^{-1} = p$, i.e., the
probability for any hyperedge to exist. The fact that all hyperedges are equally likely suggests
referring to this case as the homogeneous hypergraph ensemble. The probability of a specific
hypergraph configuration to be observed is given by Eq. (8), which in this case yields

\[
  P(\sigma, p) = p^{L_r(\sigma)} (1 - p)^{\binom{N}{r} - L_r(\sigma)}, \tag{14}
\]



where the relations $1 + e^{-\beta} = (1 - p)^{-1}$ and $e^{-\beta} = p(1 - p)^{-1}$ have been used. The application
of $\mathcal{P}_a$ and $\mathcal{P}_n$ to the homogeneous ensemble is tackled below in a more general ensemble.
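As a numerical illustration of the homogeneous ensemble, the following Python sketch samples configurations by including each possible hyperedge independently with probability $p$ and compares the sample mean of the number of hyperedges with $\binom{N}{r} p$. The function names, the seed, and the parameter values are assumptions made for this example only.

```python
import math
import random
from itertools import combinations


def p_from_beta(beta):
    """Hyperedge probability p = 1 / (1 + exp(beta)) for multiplier beta."""
    return 1.0 / (1.0 + math.exp(beta))


def sample_homogeneous(n, r, p, rng):
    """Draw one configuration: each of the C(n, r) hyperedges is kept with probability p."""
    return [e for e in combinations(range(1, n + 1), r) if rng.random() < p]


def log_prob(num_edges, n, r, p):
    """Log-probability of one specific configuration with num_edges hyperedges."""
    m = math.comb(n, r)
    return num_edges * math.log(p) + (m - num_edges) * math.log(1.0 - p)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n, r, p = 8, 3, 0.2
sizes = [len(sample_homogeneous(n, r, p, rng)) for _ in range(2000)]
mean_L = sum(sizes) / len(sizes)
expected_L = math.comb(n, r) * p  # C(8, 3) * 0.2 = 11.2
```

Summing `math.comb(math.comb(n, r), L) * math.exp(log_prob(L, n, r, p))` over all values of `L` returns 1, which is a quick check of normalization.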

The solution to the simple homogeneous problem above helps to identify some basic
features of the canonical approach, including quantities such as the probability of a hyperedge,
and of a specific hypergraph $\sigma$. Building on this, we can construct the more general
heterogeneous case, where each hyperedge has its own expectation $\langle \sigma_{i_1,\dots,i_r} \rangle$. Thus, the
Hamiltonian of Eq. (10) becomes

\[
  H(\sigma) = \sum_{(i_1,\dots,i_r) \in \sigma} \beta_{i_1,\dots,i_r} \sigma_{i_1,\dots,i_r}. \tag{15}
\]

In analogy with the homogeneous case, one defines
$p_{i_1,\dots,i_r} = \langle \sigma_{i_1,\dots,i_r} \rangle = (1 + e^{\beta_{i_1,\dots,i_r}})^{-1}$.
The partition function becomes

\[
  Z(\mathbf{p}) = \prod_{(i_1,\dots,i_r) \in T(N,r)} (1 + e^{-\beta_{i_1,\dots,i_r}})
  = \prod_{(i_1,\dots,i_r) \in T(N,r)} (1 - p_{i_1,\dots,i_r})^{-1}, \tag{16}
\]

where $\mathbf{p}$ represents the hyperedge expectations $\{p_{1,\dots,r}, \dots, p_{N-r+1,\dots,N}\}$. The probability of
a hypergraph configuration $\sigma$ is then

\[
  P(\sigma, \mathbf{p}) = \prod_{(i_1,\dots,i_r) \in T(N,r)}
  p_{i_1,\dots,i_r}^{\sigma_{i_1,\dots,i_r}} (1 - p_{i_1,\dots,i_r})^{1 - \sigma_{i_1,\dots,i_r}}, \tag{17}
\]

which is the joint probability that hyperedges with $\sigma_{i_1,\dots,i_r} = 1$ are present, and those with
$\sigma_{i_1,\dots,i_r} = 0$ are absent. If $p_{i_1,\dots,i_r} = p$ for all $(i_1, \dots, i_r) \in T(N, r)$, one recovers the
homogeneous case. The heterogeneous ensemble possesses the most degrees of freedom among
non-interacting undirected hypergraph models. If more specific constraints are imposed such
as, for instance, conditions on the average number of hyperedges visiting a node, they would
translate into additional constraints on the values of the set $\mathbf{p}$.

III. APPLICATION OF THE HYPERGRAPH PROJECTION 

Since only projections of the form $\mathcal{P}(O_{ij}) = \mathcal{P}(o_{ij})$ are considered here, the statistical
properties of the projected networks depend on the statistical properties of $o_{ij}$. It is most
useful to focus on the distribution of $o_{ij}$ in the heterogeneous ensemble, $\phi_{ij}(o_{ij}, \mathbf{p})$, and
determine how this translates into the homogeneous case (Table I summarizes the notation used
to compute $\phi_{ij}(o_{ij}, \mathbf{p})$). Let us define $T_{ij}(N, r)$ as the set of all hyperedges on the complete
hypergraph that visit $i$ and $j$ simultaneously. We also define $\bar{T}_{ij}(N, r)$, the complement of
$T_{ij}(N, r)$ with respect to $T(N, r)$. In addition, for each configuration $\sigma$, $V_{ij}(\sigma)$ is the set
of hyperedges visiting $i$ and $j$ which may or may not have cardinality $o_{ij}$ (if it does, it is
represented as before with $O_{ij}(\sigma)$). The complement of $V_{ij}(\sigma)$ with respect to $T_{ij}(N, r)$ is
$\bar{V}_{ij}(\sigma)$, and thus $T_{ij}(N, r) = V_{ij}(\sigma) \cup \bar{V}_{ij}(\sigma)$. Taking this into account, $\phi_{ij}(o_{ij}, \mathbf{p})$ can be
calculated through the expression

\[
  \phi_{ij}(o_{ij}, \mathbf{p}) = \sum_{\{\sigma\}_{\rm conf}}
  \delta\Big( o_{ij}, \sum_{(i_1,\dots,i_r) \in V_{ij}(\sigma)} \sigma_{i_1,\dots,i_r} \Big) P(\sigma, \mathbf{p}), \tag{18}
\]

where $\delta(x, y, \dots, z)$ is the Kronecker delta which can have two or more arguments, and is
equal to 1 if all the arguments are equal, and 0 otherwise. In the sum above, only those
configurations for which the delta is 1 contribute to $\phi_{ij}(o_{ij}, \mathbf{p})$, and this occurs only when
there are exactly $o_{ij}$ hyperedges in $\sigma$ that include $ij$.

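To perform the calculation, note the independence of each component of $\mathbf{p}$ in Eq. (17).
This allows factoring the sum over configurations in Eq. (18) into a product of i) the
configurations of hyperedges $\bar{T}_{ij}(N, r)$, which cannot affect the delta, and ii) the configurations
of hyperedges $T_{ij}(N, r)$, which can. The hyperedges $(i_1, \dots, i_r) \in \bar{T}_{ij}(N, r)$ each contribute
a factor $\sum_{\sigma_{i_1,\dots,i_r}=0}^{1} p_{i_1,\dots,i_r}^{\sigma_{i_1,\dots,i_r}} (1 - p_{i_1,\dots,i_r})^{1-\sigma_{i_1,\dots,i_r}} = 1$. Therefore, the remaining
factors of Eq. (18) lead to

\[
\begin{aligned}
  \phi_{ij}(o_{ij}, \mathbf{p})
  &= \sum_{V_{ij}(\sigma) \subset \mathbf{V}_{ij}}
     \delta\Big( o_{ij}, \sum_{(i_1,\dots,i_r) \in V_{ij}(\sigma)} \sigma_{i_1,\dots,i_r} \Big)
     \prod_{(i_1,\dots,i_r) \in T_{ij}(N,r)}
     p_{i_1,\dots,i_r}^{\sigma_{i_1,\dots,i_r}} (1 - p_{i_1,\dots,i_r})^{1-\sigma_{i_1,\dots,i_r}} \\
  &= \sum_{V_{ij}(\sigma) \subset \mathbf{V}_{ij}}
     \delta\Big( o_{ij}, \sum_{(i_1,\dots,i_r) \in V_{ij}(\sigma)} \sigma_{i_1,\dots,i_r} \Big)
     \prod_{(i_1,\dots,i_r) \in V_{ij}(\sigma)} p_{i_1,\dots,i_r}
     \prod_{(i_1,\dots,i_r) \in \bar{V}_{ij}(\sigma)} (1 - p_{i_1,\dots,i_r}) \\
  &= \sum_{O_{ij}(\sigma) \subset \mathbf{O}_{ij}}
     \prod_{(i_1,\dots,i_r) \in O_{ij}(\sigma)} p_{i_1,\dots,i_r}
     \prod_{(i_1,\dots,i_r) \in \bar{O}_{ij}(\sigma)} (1 - p_{i_1,\dots,i_r}),
\end{aligned}
\tag{19}
\]

where $\mathbf{V}_{ij}$ and $\mathbf{O}_{ij}$ are the unions of all possible sets $V_{ij}(\sigma)$ and $O_{ij}(\sigma)$, respectively, and
$\bar{O}_{ij}(\sigma)$ is the complement of $O_{ij}(\sigma)$ with respect to $T_{ij}(N, r)$.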

Equation (19) has been expressed in a way that makes it straightforward to explain and
convert into an algorithm for calculation. The expression can be described in the following
terms: i) separate the hyperedges from $T(N, r)$ into two groups, one that can influence $ij$
over all possible configurations, namely $T_{ij}(N, r)$, and another that cannot ($\bar{T}_{ij}(N, r)$),
ii) identify out of $T_{ij}(N, r)$ the hyperedges of $\sigma$ visiting $i$ and $j$, $V_{ij}(\sigma)$, iii) only when
$|V_{ij}(\sigma)| = |O_{ij}(\sigma)| = o_{ij}$ does $\sigma$ contribute to $\phi_{ij}(o_{ij}, \mathbf{p})$, and iv) since there are
numerous ways to choose $O_{ij}(\sigma)$ from $T_{ij}(N, r)$, one requires the set $\mathbf{O}_{ij}$, which
contains all those choices, i.e., is the ensemble of allowed configurations. Consequently, the
last line of Eq. (19) can be read as the sum of probabilities over all possible configurations
of hyperedge sets $O_{ij}(\sigma)$, where each hyperedge belongs to $T_{ij}(N, r)$. Note that there
are $|T_{ij}(N, r)| = \binom{N-2}{r-2}$ hyperedges to choose from and each $O_{ij}(\sigma)$ (configuration $\sigma$) picks
$o_{ij}$ of them, and therefore $|\mathbf{O}_{ij}| = \binom{\binom{N-2}{r-2}}{o_{ij}}$. It is worth mentioning that $\sigma$ is used in $O_{ij}(\sigma)$
to emphasize its origin as a particular hypergraph configuration, but that it becomes
redundant when the meaning of $\mathbf{O}_{ij}$ is fixed as the collection of configurations (the ensemble)
contributing to $\phi_{ij}(o_{ij}, \mathbf{p})$; at this point, each $O_{ij}$ specifies a unique configuration and no
further reference to $\sigma$ is necessary.

In fact, dropping $\sigma$ from $O_{ij}(\sigma)$ offers a combinatorial picture for the last line of Eq. (19)
and other distributions in this section. Since each $\sigma$ is a set of hyperedges connecting
non-repeated nodes in cliques of rank $r$, one can think of each hyperedge as an $r$-tuple of
non-repeated indices taken from $\{1, \dots, N\}$, and a configuration $\sigma$ as a collection of
non-repeated $r$-tuples. Therefore, $T(N, r)$ is the collection of all possible $r$-tuples, $T_{ij}(N, r)$ the
subset of $T(N, r)$ containing all $r$-tuples that have indices $i$ and $j$ simultaneously, each $O_{ij}$
a sample without replacement of $o_{ij}$ $r$-tuples taken from $T_{ij}(N, r)$, and $\mathbf{O}_{ij}$ the collection of
all possible samplings. This way of thinking about Eq. (19) transfers the emphasis from a graph
theoretic problem to a purely combinatorial one. The cardinalities of all the sets calculated
before follow naturally.
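The enumeration over all samplings grows combinatorially, so a direct implementation is practical only for small cases. A hedged alternative, not taken from the original text, is the standard dynamic-programming recursion for a sum of independent Bernoulli variables (a Poisson binomial distribution), which returns the same distribution of $o_{ij}$ in quadratic time. The function name and the example probabilities below are hypothetical.

```python
def overlap_distribution(probs):
    """Distribution of the number of present hyperedges among independent candidates.

    probs holds the presence probabilities of the hyperedges that contain both
    i and j. The returned list phi satisfies
    phi[o] = Prob(exactly o of those hyperedges are present).
    """
    phi = [1.0]  # with no candidate hyperedges, o = 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(phi) + 1)
        for o, mass in enumerate(phi):
            nxt[o] += mass * (1.0 - p)  # this hyperedge is absent
            nxt[o + 1] += mass * p      # this hyperedge is present
        phi = nxt
    return phi


# Three candidate hyperedges with illustrative (made-up) probabilities.
probs = [0.1, 0.5, 0.9]
phi = overlap_distribution(probs)
mean_o = sum(o * mass for o, mass in enumerate(phi))
```

Each pass of the outer loop adds one hyperedge and splits the existing probability mass between "absent" and "present", which is the product structure of the last line of Eq. (19) evaluated without listing every subset. The mean of the result equals the sum of the input probabilities.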

The average of $o_{ij}$ can be determined from Eq. (19) through
$\langle o_{ij}(\mathbf{p}) \rangle = \sum_{o_{ij}=0}^{\binom{N-2}{r-2}} o_{ij} \, \phi_{ij}(o_{ij}, \mathbf{p})$,
or by calculating $\sum_{\{\sigma\}_{\rm conf}} o_{ij}(\sigma) P(\sigma)$. The result is

\[
  \langle o_{ij}(\mathbf{p}) \rangle = \sum_{(i_1,\dots,i_r) \in T_{ij}(N,r)} p_{i_1,\dots,i_r}, \tag{20}
\]

which fits intuition, stating that the expectation of the number of hyperedges visiting the
pair $ij$ is the sum, over the hyperedges that can visit $ij$, of the probability that each one
is present in the ensemble.

In the homogeneous case, where $p_{i_1,\dots,i_r} = p$ for all $(i_1, \dots, i_r)$, $\phi_{ij}(o_{ij}, \mathbf{p}) \to \phi_{ij}(o_{ij}, p)$
becomes

\[
  \phi_{ij}(o_{ij}, p) = \binom{\binom{N-2}{r-2}}{o_{ij}} p^{o_{ij}} (1 - p)^{\binom{N-2}{r-2} - o_{ij}}, \tag{21}
\]

a binomial distribution with $\langle o_{ij} \rangle = \binom{N-2}{r-2} p$ (Fig. 2(a)). This average has an interesting
interpretation explained below regarding what sorts of signatures a multiway interaction
may provide in observational studies. Another noteworthy fact stemming from Eq. (21),
even in this very simple case of homogeneous $p$, is that links display disorder in $o_{ij}$, and
this could easily pass on to the projected networks in many kinds of projections. Further
structure can be given to this disorder by changing the projection and/or the hypergraph
ensemble.

Some general features of $\mathcal{P}$ can now be described. First, note that monotonic smooth
projections, satisfying the inverse function theorem, offer a way to formally write the
distribution of $w_{ij}$ from the distribution of $o_{ij}$ because there is a one-to-one relation between
the two quantities. Defining the distribution $\chi_{ij}(w_{ij}, \mathbf{p})$, the change of variables theorem for
probability distributions implies

\[
  \chi_{ij}(w_{ij}, \mathbf{p}) = \frac{\phi_{ij}(\mathcal{P}^{-1}(w_{ij}), \mathbf{p})}{|\mathcal{P}'(\mathcal{P}^{-1}(w_{ij}))|}, \tag{22}
\]

where $\mathcal{P}'$ is the derivative of $\mathcal{P}$. The additive projection $\mathcal{P}_a$ satisfies such conditions in a
trivial way because it is just the identity function. However, a large class of functions also
satisfy these conditions, including all power law and logarithmic growth or decay functions
(when decay applies, one must be mindful of additional conditions). The nominal projection,
on the other hand, does not satisfy the condition because any value of $o_{ij} \geq 1$ leads to the
same weight $w_{ij} = 1$, and thus the inverse of $\mathcal{P}_n$ is not defined.

An important feature of these mappings is patent in Eq. (21): the distribution of $o_{ij}$
is narrow, with relative width decaying as $\binom{N-2}{r-2}^{-1/2}$, so as $N$ grows, more of the mass of
the distribution is concentrated around its maximum, labelled $o^{*}_{ij}$, which coincides with the
average $\langle o_{ij} \rangle = \binom{N-2}{r-2} p$ for large $N$. It is then expected that for a wide range of possible
projections, asymptotic estimates of $\chi_{ij}(w_{ij}, p)$ can be straightforwardly obtained.

An interesting observation emerges from the previous results. Equation (21) predicts via
its average that in projected networks of homogeneous hypergraphs the interaction weight
between nodes in the system increases with $\binom{N-2}{r-2}$, or roughly $N^{r-2}$ for large $N$ and finite
$r > 2$ (but $N \gg r$). This also indicates the following: since nodes added to the hypergraph
only establish interactions with other nodes by way of hyperedges, the addition of these
nodes increases on average the interaction weights among all nodes, i.e., new and already
present. In contrast, with only pairwise interactions, the addition of new nodes would not
contribute to the weight of interactions of nodes that are already present, because the new
nodes require only direct new connections to existing nodes one at a time, and thus do
not effect changes in the weights between existing nodes. This mechanism distinguishes
networks that are fundamentally pairwise in origin from those which may appear as pairwise
due to data collection or other factors but are, in fact, fundamentally hypergraphs. This
observation may suggest tests to distinguish the two scenarios.
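The size dependence described above can be tabulated with a few lines of Python. This is a minimal sketch under the assumption that $p$ is held fixed while $N$ grows; in an empirical setting $p$ could itself depend on $N$. The function name and the parameter values are illustrative only.

```python
from math import comb


def mean_pair_weight(n, r, p):
    """Mean additive weight <o_ij> = C(n - 2, r - 2) * p in the homogeneous ensemble."""
    return comb(n - 2, r - 2) * p


p = 0.01
# Mean link weight for three system sizes and three ranks.
growth = {r: [mean_pair_weight(n, r, p) for n in (10, 20, 40)] for r in (2, 3, 4)}
```

For `r = 2` the mean weight does not change with `n`, whereas for `r = 3` it grows linearly and for `r = 4` roughly quadratically, which is the contrast between pairwise and multiway origins discussed in the text.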

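The two projections $\mathcal{P}_a$ and $\mathcal{P}_n$ can now be explained further. For $\mathcal{P}_a$, the properties of $w_{ij}$
are those of $o_{ij}$, and thus already calculated. The other property to describe is the so-called
strength $s_i$ of node $i$, equal to $\sum_j o_{ij}$. It is intuitively helpful to calculate the distribution of
strengths $\xi_i(s_i, \mathbf{p})$ by making use of the relation between $s_i$ and $\ell_i$, the number of hyperedges
visiting $i$. These two quantities relate via $s_i = (r - 1)\ell_i$, and one can determine the
distribution $\zeta_i(\ell_i, \mathbf{p})$ of $\ell_i$ and from it compute $\xi_i(s_i, \mathbf{p})$. Note that while $s_i$ is a property of the
graph, $\ell_i$ is a property of the hypergraph. Once again, the independence of the components of
$\mathbf{p}$ simplifies the sum over configurations $\{\sigma\}_{\rm conf}$ (notation in Table II). The hyperedges that
could affect $\ell_i$ belong to $T_i(N, r)$, the collection of all hyperedges visiting $i$ in $T(N, r)$, and
$L_i(\sigma)$ is the set of hyperedges from $T_i(N, r)$ in configuration $\sigma$ (when $|L_i(\sigma)| = \ell_i$ we write
it as $\Lambda_i(\sigma)$). From the definition
$\zeta_i(\ell_i, \mathbf{p}) = \sum_{\{\sigma\}_{\rm conf}} \delta\big( \ell_i, \sum_{(i_1,\dots,i_r) \in L_i(\sigma)} \sigma_{i_1,\dots,i_r} \big) P(\sigma; \mathbf{p})$,
one can quickly conclude that

\[
  \zeta_i(\ell_i, \mathbf{p}) = \sum_{\Lambda_i(\sigma) \in \mathbf{\Lambda}_i}
  \prod_{(i_1,\dots,i_r) \in \Lambda_i(\sigma)} p_{i_1,\dots,i_r}
  \prod_{(i_1,\dots,i_r) \in \bar{\Lambda}_i(\sigma)} (1 - p_{i_1,\dots,i_r}), \tag{23}
\]

where $\mathbf{\Lambda}_i$ is the ensemble of configurations $\Lambda_i(\sigma)$, and $\Lambda_i(\sigma) \cup \bar{\Lambda}_i(\sigma) = T_i(N, r)$. Then

\[
  \xi_i(s_i, \mathbf{p}) = \zeta_i(s_i/(r-1), \mathbf{p})/(r-1), \tag{24}
\]

where $s_i$ takes values $0, r-1, 2(r-1), \dots, (r-1)\binom{N-1}{r-1}$. Once again, an equivalence between
hyperedge sets and combinatorics can be drawn: $T_i(N, r)$ is the union of all $r$-tuples drawn
from $\{1, \dots, N\}$ with one element always $i$, and thus there are $|T_i(N, r)| = \binom{N-1}{r-1}$ $r$-tuples
in total. Each $\Lambda_i$ is a distinct choice of $\ell_i$ of these $r$-tuples; clearly $|\mathbf{\Lambda}_i| = \binom{\binom{N-1}{r-1}}{\ell_i}$. The
sum $\sum_{\Lambda_i \in \mathbf{\Lambda}_i}$ is a sum over all choices of $\ell_i$ $r$-tuples from $T_i(N, r)$. The averages of these
quantities are given by

\[
  \langle \ell_i(\mathbf{p}) \rangle = \sum_{(i_1,\dots,i_r) \in T_i(N,r)} p_{i_1,\dots,i_r}, \tag{25}
\]

and

\[
  \langle s_i(\mathbf{p}) \rangle = (r - 1) \langle \ell_i(\mathbf{p}) \rangle = (r - 1) \sum_{(i_1,\dots,i_r) \in T_i(N,r)} p_{i_1,\dots,i_r}. \tag{26}
\]

For the homogeneous case, $\zeta_i(\ell_i, p) = \binom{\binom{N-1}{r-1}}{\ell_i} p^{\ell_i} (1 - p)^{\binom{N-1}{r-1} - \ell_i}$, with average
$\langle \ell_i \rangle = \binom{N-1}{r-1} p$. Therefore,

\[
  \xi_i(s_i, p) = \frac{1}{r-1} \binom{\binom{N-1}{r-1}}{s_i/(r-1)} p^{s_i/(r-1)} (1 - p)^{\binom{N-1}{r-1} - s_i/(r-1)}, \tag{27}
\]

and $\langle s_i \rangle = (r - 1) \binom{N-1}{r-1} p$ (see Fig. 2(b)).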

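The nominal projection $\mathcal{P}_n$ needs a different treatment. Note that under this projection,
$w_{ij}$ can be either 0 or 1. To determine the probability for $w_{ij}$, $\pi_{ij}(w_{ij}, \mathbf{p})$, one merely needs
to determine the probabilities that $o_{ij}$ is either 0 or $\geq 1$, that is
$\pi_{ij}(w_{ij} = 0, \mathbf{p}) = \phi_{ij}(o_{ij} = 0, \mathbf{p})$ or $\pi_{ij}(w_{ij} = 1, \mathbf{p}) = 1 - \phi_{ij}(o_{ij} = 0, \mathbf{p})$. Therefore,

\[
  \pi_{ij}(w_{ij}, \mathbf{p}) =
  \begin{cases}
    1 - \prod_{(i_1,\dots,i_r) \in T_{ij}(N,r)} (1 - p_{i_1,\dots,i_r}), & w_{ij} = 1, \\
    \prod_{(i_1,\dots,i_r) \in T_{ij}(N,r)} (1 - p_{i_1,\dots,i_r}), & w_{ij} = 0.
  \end{cases}
  \tag{28}
\]

In the homogeneous case,

\[
  \pi_{ij}(w_{ij}, p) =
  \begin{cases}
    1 - (1 - p)^{\binom{N-2}{r-2}}, & w_{ij} = 1, \\
    (1 - p)^{\binom{N-2}{r-2}}, & w_{ij} = 0.
  \end{cases}
  \tag{29}
\]

These results have implications for the average number of connections for each node of a
projected network, as we explain next.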
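The probability that a link is present under the nominal projection is simple enough to evaluate directly. The sketch below covers both the heterogeneous and the homogeneous expressions; the function names are hypothetical, and the input list is assumed to hold the probabilities of exactly those hyperedges that contain both $i$ and $j$.

```python
from math import comb, prod


def link_probability(probs):
    """Heterogeneous case: one minus the probability that every candidate hyperedge is absent."""
    return 1.0 - prod(1.0 - p for p in probs)


def link_probability_homogeneous(n, r, p):
    """Homogeneous case: all C(n - 2, r - 2) candidate hyperedges share the same p."""
    return 1.0 - (1.0 - p) ** comb(n - 2, r - 2)


# With n = 5 and r = 3 there are C(3, 1) = 3 hyperedges containing a given pair.
pi_one = link_probability_homogeneous(5, 3, 0.5)
pi_one_het = link_probability([0.5, 0.5, 0.5])
```

Both calls describe the same situation, so they agree; multiplying `pi_one` by `n - 1` gives the mean degree of a node when every pair is treated in this way.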

For network projections $\mathcal{P}_a$ and $\mathcal{P}_n$, the number of connections $k_i$ visiting node $i$ is
characterized by $\psi_i(k_i, \mathbf{p})$, the distribution of $k_i$. The degree can be either 0 or take any
value in $r - 1 \leq k_i \leq N - 1$. To determine $\psi_i(k_i, \mathbf{p})$ (notation in Table III), one can
proceed in a similar way as before: in configuration $\sigma$, the set of hyperedges visiting $i$ and
producing degree $k_i$ is $K_i(\sigma)$. This means that the hyperedges in $K_i(\sigma)$ visit exactly $k_i$ nodes
and node $i$. It is interesting to note that another configuration $\sigma'$, associated with $K_i(\sigma')$,
with a different set and/or number of hyperedges can lead to the same $k_i$, because these
hyperedges still visit the same number of nodes $k_i$ (see Fig. 3 for an illustration). With this
definition, one can write

\[
  \psi_i(k_i, \mathbf{p}) = \sum_{K_i(\sigma) \in \mathbf{K}_i}
  \prod_{(i_1,\dots,i_r) \in K_i(\sigma)} p_{i_1,\dots,i_r}
  \prod_{(i_1,\dots,i_r) \in \bar{K}_i(\sigma)} (1 - p_{i_1,\dots,i_r}), \tag{30}
\]

where $\mathbf{K}_i$ is the union of all possible sets $K_i(\sigma)$, and the complement set $\bar{K}_i(\sigma)$ satisfies
$K_i(\sigma) \cup \bar{K}_i(\sigma) = T_i(N, r)$. Since the number of hyperedges is not fixed across members of
$\mathbf{K}_i$, one can further organize the $K_i(\sigma)$ by their numbers of hyperedges $\ell_i(\sigma)$. The bounds
of $\ell_i$ are dictated by the following: for degree $k_i$, a minimum of $\lceil k_i/(r-1) \rceil$ hyperedges
is required ($\lceil \cdot \rceil$ represents the ceiling function), and there can be no more than $\binom{k_i}{r-1}$
hyperedges. Using this organization, and introducing the notation $K_i^{(\ell_i)}(\sigma)$ and $\mathbf{K}_i(\ell_i)$ to
represent, respectively, the sets $K_i(\sigma)$ involving exactly $\ell_i$ hyperedges and their unions, one
can write

\[
  \psi_i(k_i, \mathbf{p}) = \sum_{\ell_i = \lceil k_i/(r-1) \rceil}^{\binom{k_i}{r-1}} \;
  \sum_{K_i^{(\ell_i)}(\sigma) \in \mathbf{K}_i(\ell_i)}
  \prod_{(i_1,\dots,i_r) \in K_i^{(\ell_i)}(\sigma)} p_{i_1,\dots,i_r}
  \prod_{(i_1,\dots,i_r) \in \bar{K}_i^{(\ell_i)}(\sigma)} (1 - p_{i_1,\dots,i_r}). \tag{31}
\]

The sets $\mathbf{K}_i(\ell_i)$ are only subsets of $\mathbf{\Lambda}_i$ in which the hyperedges involve exactly $i$ and $k_i$ other
nodes. Finally, it is possible to exploit one more symmetry that facilitates an algorithmic
understanding of $\psi_i(k_i, \mathbf{p})$: the sets that make up $\mathbf{K}_i(\ell_i)$ involve several possible distinct
node sets. However, one can further segregate these sets by the specific nodes in them.
Hence, if one takes a set, $\rho(k_i)$, of $k_i$ specific nodes and $i$, there are several configurations
in which their associated $K_i^{(\ell_i)}(\sigma)$ contain $\ell_i$ hyperedges visiting only those nodes. Thus,
a configuration with specific $\rho(k_i)$ nodes connected to $i$, using $\ell_i$ hyperedges, is labelled
$I_i^{(\rho(k_i), \ell_i)}(\sigma)$, and the union of configurations is labelled $\mathbf{I}_i(\rho(k_i), \ell_i)$. The union of all sets
$\mathbf{I}_i(\rho(k_i), \ell_i)$ (which are non-intersecting) is equal to $\mathbf{K}_i(\ell_i)$. This leads to the final expression

\[
  \psi_i(k_i, \mathbf{p}) = \sum_{\rho(k_i) \in \mathbf{R}_i(N, k_i)} \;
  \sum_{\ell_i = \lceil k_i/(r-1) \rceil}^{\binom{k_i}{r-1}} \;
  \sum_{I_i^{(\rho(k_i), \ell_i)}(\sigma) \in \mathbf{I}_i(\rho(k_i), \ell_i)}
  \prod_{(i_1,\dots,i_r) \in I_i^{(\rho(k_i), \ell_i)}(\sigma)} p_{i_1,\dots,i_r}
  \prod_{(i_1,\dots,i_r) \in \bar{I}_i^{(\rho(k_i), \ell_i)}(\sigma)} (1 - p_{i_1,\dots,i_r}), \tag{32}
\]

where $\bar{I}_i^{(\rho(k_i), \ell_i)}(\sigma)$ is the complement of $I_i^{(\rho(k_i), \ell_i)}(\sigma)$ with respect to $T_i(N, r)$, and $\mathbf{R}_i(N, k_i)$
is the union of all possible $\rho(k_i)$, each one a distinct $(k_i + 1)$-tuple taken from the set
$\{1, \dots, N\}$ with one choice always being $i$. The sizes of the sets are $|\mathbf{R}_i(N, k_i)| = \binom{N-1}{k_i}$ and
$|\mathbf{I}_i(\rho(k_i), \ell_i)| = Q_{r-1}(k_i, \ell_i)$; the latter is the result of a combinatorial problem that can be
defined in terms of general graph theory. Specifically, $Q_{r-1}(k_i, \ell_i)$ corresponds to the number
of distinct graphs that can be constructed with $k_i$ nodes, all of which belong to at least one of
$\ell_i$ cliques of size $r - 1$. In fact, each $I_i^{(\rho(k_i), \ell_i)}(\sigma)$ can be mapped to one of these graphs.
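To determine $\langle k_i(p)\rangle$, it is simpler to use the relation

$$\langle k_i(p)\rangle=\sum_{\{\sigma\}_{\rm conf}}k_i(\sigma)P(\sigma)=\sum_{\{\sigma\}_{\rm conf}}\sum_{j=1,\,j\neq i}^{N}\Theta(o_{ij}(\sigma))P(\sigma)=\sum_{j=1,\,j\neq i}^{N}\sum_{\{\sigma\}_{\rm conf}}\Theta(o_{ij}(\sigma))P(\sigma), \tag{33}$$

where $\Theta(o_{ij}(\sigma))$ equals 1 if $o_{ij}(\sigma)>0$ and 0 otherwise. By first summing over a single $j$, one notices that only hyperedges in $T_{ij}(N,r)$ are relevant, all others contributing a factor of 1, and that, using

$$\Theta(o_{ij}(\sigma))=1-\delta\Big(\sum_{(i_1,\dots,i_r)\in T_{ij}(N,r)}\sigma_{i_1,\dots,i_r},\,0\Big), \tag{34}$$

one arrives at

$$\langle k_i(p)\rangle=\sum_{j=1,\,j\neq i}^{N}\Big[1-\prod_{(i_1,\dots,i_r)\in T_{ij}(N,r)}\left(1-p_{i_1,\dots,i_r}\right)\Big]. \tag{35}$$

When compared with Eq. (28), it becomes evident that each link $ij$ contributes to $k_i$ independently.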

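In the homogeneous case, making use of the combinatorial results presented, one obtains (Fig. 2(c))

$$\psi_i(k_i,p)=\binom{N-1}{k_i}\sum_{\ell_i=\lceil k_i/(r-1)\rceil}^{\binom{k_i}{r-1}}Q_{r-1}(k_i,\ell_i)\,p^{\ell_i}(1-p)^{\binom{N-1}{r-1}-\ell_i}. \tag{36}$$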

Without going into too much detail, $Q_{r-1}(k_i,\ell_i)$ can be calculated via the inclusion-exclusion principle of combinatorics [25], which produces

$$Q_{r-1}(k_i,\ell_i)=\sum_{m=0}^{k_i}(-1)^{k_i-m}\binom{k_i}{m}\binom{\binom{m}{r-1}}{\ell_i}. \tag{37}$$



Among the identities satisfied by $Q_{r-1}(k,\ell)$, one finds that $\binom{\binom{N-1}{r-1}}{\ell}=\sum_{k}\binom{N-1}{k}Q_{r-1}(k,\ell)$, which is used to show normalization of $\psi_i(k_i,p)$. Another identity, $\sum_{k}k\binom{N-1}{k}Q_{r-1}(k,\ell)=(N-1)\left[\binom{\binom{N-1}{r-1}}{\ell}-\binom{\binom{N-2}{r-1}}{\ell}\right]$, leads to the average of $\psi_i(k_i,p)$,

$$\langle k_i\rangle=(N-1)\left[1-(1-p)^{\binom{N-2}{r-2}}\right], \tag{38}$$

where the brackets are equal to $\pi_{ij}(w_{ij}=1,p)$ from Eq. (29) (see Fig. 2(d)). This average can also be calculated directly from Eq. (35).
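
The combinatorial statements above can be checked numerically. The following is a minimal sketch, assuming the inclusion-exclusion form of $Q_{r-1}(k,\ell)$ as the number of ways to choose $\ell$ distinct $(r-1)$-subsets of $k$ labelled nodes that cover every node; the function names `Q`, `psi` and `mean_degree` are illustrative and not part of the original text.

```python
from math import comb, ceil

def Q(k, ell, r):
    """Count families of ell distinct (r-1)-subsets of k labelled nodes
    in which every node appears at least once (inclusion-exclusion)."""
    return sum((-1) ** (k - m) * comb(k, m) * comb(comb(m, r - 1), ell)
               for m in range(k + 1))

def psi(k, p, N, r):
    """Homogeneous degree distribution of the projected network, Eq. (36)."""
    M = comb(N - 1, r - 1)                 # hyperedges that can visit node i
    lo, hi = ceil(k / (r - 1)), comb(k, r - 1)
    return comb(N - 1, k) * sum(
        Q(k, ell, r) * p ** ell * (1 - p) ** (M - ell)
        for ell in range(lo, hi + 1))

def mean_degree(p, N, r):
    """Average degree, Eq. (38)."""
    return (N - 1) * (1 - (1 - p) ** comb(N - 2, r - 2))

if __name__ == "__main__":
    N, r, p = 8, 3, 0.1
    dist = [psi(k, p, N, r) for k in range(N)]
    print("normalization:", sum(dist))
    print("mean from psi:", sum(k * w for k, w in enumerate(dist)))
    print("mean from Eq. (38):", mean_degree(p, N, r))
```

For $r=3$ and $k_i=3$ the function `Q` gives 3 for $\ell_i=2$ and 1 for $\ell_i=3$, the two values quoted in the caption of Fig. 3.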

To conclude this section, it is useful to point out how the previous results can be connected with concrete problems. The logic is similar to that found in Refs. [22, 23], in which the ensemble is chosen to fit observations. In the framework presented here, it is possible to choose the hypergraph ensemble to fit hypergraph properties (such as Eqs. (20) or (25)), projected network properties (Eqs. (26) or (35)), or a combination of both (as long as it is well defined); the choice comes down to practical considerations such as the available data one intends to fit, or the belief that certain mechanisms may be at play and therefore must be part of the model. Once an ensemble is defined (satisfying the assumptions of hyperedges which are non-interacting, undirected, and with uniform rank), the expressions derived above for the heterogeneous ensemble apply, but an additional set of constraints emerges for the $p_{i_1,\dots,i_r}$ guaranteeing that the entropy is maximized, distinguishing the situation from that of the fully heterogeneous ensemble, where each $p_{i_1,\dots,i_r}$ is free to have any value between 0 and 1.

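As an example, consider the ensemble that specifies strengths $\langle s_i\rangle$ on the projected networks with projection $\mathcal{P}_\sigma$. This can be constructed from the Hamiltonian

$$H(\sigma)=\sum_{i=1}^{N}\beta_i s_i(\sigma)=(r-1)\sum_{(i_1,\dots,i_r)\in\sigma}\left(\beta_{i_1}+\cdots+\beta_{i_r}\right)\sigma_{i_1,\dots,i_r}. \tag{39}$$

This ensemble is completely specified by calculating the relation between $\langle\sigma_{i_1,\dots,i_r}\rangle$, by definition equal to $p_{i_1,\dots,i_r}$, and the set of parameters $\{\beta_1,\dots,\beta_N\}$. After determining $P(\sigma)$, one can compute $\langle\sigma_{i_1,\dots,i_r}\rangle=\sum_{\{\sigma\}_{\rm conf}}\sigma_{i_1,\dots,i_r}P(\sigma)$ to find

$$\langle\sigma_{i_1,\dots,i_r}\rangle=p_{i_1,\dots,i_r}=\frac{e^{-(r-1)(\beta_{i_1}+\cdots+\beta_{i_r})}}{1+e^{-(r-1)(\beta_{i_1}+\cdots+\beta_{i_r})}}, \tag{40}$$

where the parameters satisfy Eq. (26), and therefore

$$\langle s_i\rangle=(r-1)\sum_{(i_1,\dots,i_r)\in T_i(N,r)}p_{i_1,\dots,i_r}=(r-1)\sum_{(i_1,\dots,i_r)\in T_i(N,r)}\frac{e^{-(r-1)(\beta_{i_1}+\cdots+\beta_{i_r})}}{1+e^{-(r-1)(\beta_{i_1}+\cdots+\beta_{i_r})}}. \tag{41}$$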

One way to understand this result is from the relation

$$dp_{i_1,\dots,i_r}=-(r-1)\,p_{i_1,\dots,i_r}\left(1-p_{i_1,\dots,i_r}\right)\sum_{g=1}^{r}d\beta_{i_g}. \tag{42}$$

If only $\beta_{i_g}$ changes by $d\beta_{i_g}$, hyperedges without node $i_g$ are unaffected, and the probabilities of those with it all change in proportion to $d\beta_{i_g}$. As in Ref. [22], the $\beta_i$ can be taken from a distribution, leading in turn to a distribution of $\langle s_i\rangle$. This can be used to tune a desired distribution of $\langle s_i\rangle$ as dictated by the problem.
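
To make the mapping from the parameters $\beta_i$ to expected strengths concrete, the sketch below evaluates Eqs. (40) and (41) by brute-force enumeration of all $r$-tuples. It is only an illustration for small $N$; the function names `hyperedge_probs` and `expected_strengths` are hypothetical, and nodes are labelled $0,\dots,N-1$ for convenience.

```python
from itertools import combinations
from math import exp

def hyperedge_probs(beta, r):
    """p_{i1..ir} of Eq. (40) for every r-subset of the nodes."""
    probs = {}
    for edge in combinations(range(len(beta)), r):
        x = (r - 1) * sum(beta[i] for i in edge)
        probs[edge] = 1.0 / (1.0 + exp(x))   # same as e^{-x} / (1 + e^{-x})
    return probs

def expected_strengths(beta, r):
    """<s_i> of Eq. (41): (r - 1) times the sum of p over hyperedges visiting i."""
    s = [0.0] * len(beta)
    for edge, p in hyperedge_probs(beta, r).items():
        for i in edge:
            s[i] += (r - 1) * p
    return s

if __name__ == "__main__":
    print(expected_strengths([0.0, 0.0, 0.0, 0.0, 0.0], r=3))
    print(expected_strengths([1.0, 0.0, 0.0, 0.0, 0.0], r=3))
```

With all $\beta_i=0$ every hyperedge has probability 1/2; raising a single $\beta_i$ lowers the expected strength of node $i$ most strongly, in line with the sign in Eq. (42).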
IV. PERCOLATION PROPERTIES AND SPARSE CASES



Another important aspect of the hypergraph ensemble and its projected networks is their percolation properties. To calculate these, one can use the equivalence, first pointed out by Fortuin and Kasteleyn [17], between percolation and the mean-field $q$-state Potts model at $q\to 1$. The solution to the latter model consists of determining the state of the nodes, and whether there is a phase transition. The solution and its properties can be obtained by studying the model's Helmholtz free energy. A detailed development of the equivalence of the models can be found in Ref. [24]; here, I set up the calculation starting at the free energy and develop the percolation properties from there. I consider the homogeneous case only, although it is in principle possible to solve some heterogeneous models.

Consider the Hamiltonian of the general $q$-state Potts model with $N$ nodes, $H_q=-\sum_{i_1,\dots,i_r}J_{i_1,\dots,i_r}\,\delta(u_{i_1},\dots,u_{i_r})$, where $u_{i_1},\dots,u_{i_r}$ represent the respective spin states of the nodes $i_1,\dots,i_r$ from the possible states $1,\dots,q$, and $J_{i_1,\dots,i_r}$ the strength of the interaction among them. A hyperedge exists among nodes $i_1,\dots,i_r$ if $u_{i_1}=\cdots=u_{i_r}$, i.e., if these nodes are in the same spin state. Let us denote the number of system nodes with spin $u$ as $N_u$, and the density of these as $c_u=N_u/N$, which satisfies $\sum_u c_u=1$. In the homogeneous system, since $J_{i_1,\dots,i_r}=J$ and given that only $r$-tuples of equal spins contribute to $H_q$ (i.e., only hyperedges), the energy is equal to $H_q=-J\sum_u\binom{N_u}{r}$, the sum of interaction energies among all hyperedges having equal spin. The connection between percolation and the Potts model translates into the relation $J=-\ln(1-p)$, and for small $p$, this approximates to $J\approx p$.

In order to find the Helmholtz free energy of the system, one must first determine the partition function $Z_q$. In this model, it can be written on the basis of all configurations of state values $u_i$, or in terms of the densities $\{c_u\}_{u=1,\dots,q}$. Using the latter set of variables, with $N_u=Nc_u$, and taking into account the multiplicity in the choices for each node state, one arrives at

$$Z_q=\sum_{\{c_u\}}e^{-\left[-J\sum_u\binom{N_u}{r}\right]}\frac{N!}{\prod_u N_u!}\simeq\sum_{\{c_u\}}e^{J\sum_u\binom{Nc_u}{r}-N\sum_u c_u\ln c_u}, \tag{43}$$



where the inverse temperature parameter $\beta$ is absorbed into $J$. In the canonical ensemble, the free energy is given by $F_q=-\ln Z_q$. When the interaction $J$ is too weak to keep the nodes ordered collectively in groups of common states, the solution to the problem is expected to be symmetric, i.e. $c_u=1/q$ (all states are equally occupied). However, as the interaction strengthens, one would expect that symmetry is broken and one state (say $u=1$) becomes dominant. By these arguments, $F_q$ can be sought by introducing the ansatz

$$c_u=\begin{cases}\dfrac{1+(q-1)f_q}{q}, & u=1\\[2mm] \dfrac{1-f_q}{q}, & u\neq 1,\end{cases} \tag{44}$$

where $f_q$ is the fractional size of the system in state $u=1$, and the condition $\sum_u c_u=1$ is automatically satisfied. This leads to

$$Z_q\sim\int df_q\,\exp\left\{J\binom{\frac{N}{q}\left(1+(q-1)f_q\right)}{r}+J(q-1)\binom{\frac{N}{q}\left(1-f_q\right)}{r}-\frac{N}{q}\left(1+(q-1)f_q\right)\ln\frac{1+(q-1)f_q}{q}-\frac{N(q-1)}{q}\left(1-f_q\right)\ln\frac{1-f_q}{q}\right\}. \tag{45}$$

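In the thermodynamic limit ($N\to\infty$), the Laplace method of integration can be applied to Eq. (45) [26]. Once applied, $F_q=-\ln Z_q$ yields to leading order

$$F_q\simeq-J\binom{\frac{N}{q}\left(1+(q-1)\bar{f}_q\right)}{r}-J(q-1)\binom{\frac{N}{q}\left(1-\bar{f}_q\right)}{r}+\frac{N}{q}\left(1+(q-1)\bar{f}_q\right)\ln\frac{1+(q-1)\bar{f}_q}{q}+\frac{N(q-1)}{q}\left(1-\bar{f}_q\right)\ln\frac{1-\bar{f}_q}{q}, \tag{46}$$

where $\bar{f}_q$ is the value of $f_q$ for which the exponent of the argument of the $Z_q$ integral is maximized. Taking the first derivative of the exponent, and using $c_1(f_q)$ and $c_u(f_q)$ to refer to the fractions $c_u$ from Eq. (44) evaluated at $u=1$ and $u\neq 1$, respectively, $\bar{f}_q$ must satisfy

$$\ln\frac{1-\bar{f}_q}{1+(q-1)\bar{f}_q}=-J\left[\left.\frac{\partial}{\partial x}\binom{x}{r}\right|_{x=Nc_1(\bar{f}_q)}-\left.\frac{\partial}{\partial x}\binom{x}{r}\right|_{x=Nc_u(\bar{f}_q)}\right]. \tag{47}$$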



This is the self-consistency equation for the fractional size of the component of broken symmetry. For $q=1$, $f=\bar{f}_{q=1}$ is the fractional size of the percolating spanning cluster. Note that $\bar{f}_q=0$ is also a solution to Eq. (47), but its stability breaks down when the second derivative of the exponent in the integrand of $Z_q$ changes sign. This leads to the relation

$$p_c\approx J_c=\left[N\,\frac{d^2}{dN^2}\binom{N}{r}\right]^{-1}, \tag{48}$$

where $q=1$ has already been introduced (otherwise the solution would be the same but with $N/q$ in place of $N$ everywhere).
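
As a numerical illustration of Eq. (48), the sketch below evaluates the second derivative of $\binom{x}{r}=x(x-1)\cdots(x-r+1)/r!$ exactly at $x=N$ and compares the resulting threshold with its large-$N$ form $(r-2)!/N^{r-1}$, which follows from keeping only the leading power of $N$. The function names `d2_binom`, `p_c` and `p_c_asymptotic` are illustrative assumptions, not notation from the text.

```python
from math import factorial

def d2_binom(N, r):
    """Second derivative of binom(x, r) with respect to x, evaluated at x = N.
    Uses the product rule on x (x - 1) ... (x - r + 1) / r!."""
    total = 0
    for a in range(r):
        for b in range(r):
            if a == b:
                continue
            prod = 1
            for j in range(r):
                if j != a and j != b:
                    prod *= (N - j)
            total += prod
    return total / factorial(r)

def p_c(N, r):
    """Percolation threshold from Eq. (48), using J_c ~ p_c for small p."""
    return 1.0 / (N * d2_binom(N, r))

def p_c_asymptotic(N, r):
    """Leading large-N behaviour of the threshold."""
    return factorial(r - 2) / N ** (r - 1)

if __name__ == "__main__":
    for r in (2, 3, 4):
        print(r, p_c(1000, r), p_c_asymptotic(1000, r))
```

For $r=2$ this reduces to the familiar threshold $p_c=1/N$ of ordinary random graphs.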

In the thermodynamic limit, one can derive a compact equation for $f$ and arbitrary $r$. Both terms in the brackets of Eq. (47) are polynomials emerging from the derivative $\partial\binom{x}{r}/\partial x$, which we label $M(x,r)=\sum_{m=0}^{r-1}a_r(m)x^m$, evaluated at $x=Nc_1$ and $Nc_u$ (we continue the same shorthand of $c_{u\neq 1}=c_u$). Subtracting them, one obtains $M(N_1,r)-M(N_{u\neq 1},r)=\sum_{m=1}^{r-1}a_r(m)\left(N_1^m-N_{u\neq 1}^m\right)$, where the $m=0$ term vanishes. It is possible to express the coefficients $a_r(m)$ in terms of elementary symmetric polynomials and binomial coefficients, but the analysis here is restricted to the asymptotic limit, and thus only requires the coefficient $a_r(r-1)$, equal to $1/(r-1)!$ as can be determined by inspecting $M(x,r)$. Using the identity $x^m-y^m=(x-y)\sum_{l=0}^{m-1}x^{m-1-l}y^{l}$, the ansatz (44), and the self-consistency relation (47) with $q=1$, one obtains

$$\ln(1-f)=-Jf\sum_{m=1}^{r-1}a_r(m)N^m\sum_{l=0}^{m-1}(1-f)^l=-J\sum_{m=1}^{r-1}a_r(m)N^m\left(1-(1-f)^m\right). \tag{49}$$

Close to percolation, it is justified to write $J=\lambda J_c$, with $\lambda>1$ and $J_c$ from Eq. (48). By L'Hôpital's rule, for the dominant term in $N$, the size of the largest component emerges as

$$\ln(1-f)=-\frac{\lambda}{r-1}\left(1-(1-f)^{r-1}\right), \tag{50}$$

which generalizes expressions for this quantity for specific small $r$ values found in Refs. [19, 20]. To test this expression, it is customary to define the percolation problem with respect to a network (or hypergraph) that is not complete, but instead is already diluted. By defining the rescaling $z=p/p_{\max}$, where typically $p_{\max}\ll 1$, the original undiluted hypergraph is $z=1$, and percolation occurs at $z_c=p_c/p_{\max}$, or if using $\lambda_{\max}$, $z_c=1/\lambda_{\max}$ (see Fig. 4(a)).

The percolation transition can be shown to be second order by expanding both sides of Eq. (50), which leads to

$$\sum_{g=1}^{\infty}\frac{f^{g}}{g}=\frac{\lambda}{r-1}\sum_{g'=1}^{r-1}(-1)^{g'+1}\binom{r-1}{g'}f^{g'}. \tag{51}$$

For small $f$, close to the percolation transition, only the first few terms on both sides of the equality are relevant. Retaining up to second order,

$$f+\frac{f^2}{2}=\frac{\lambda}{r-1}\left[(r-1)f-\binom{r-1}{2}f^2\right], \tag{52}$$

which produces

$$f\approx\frac{2(\lambda-1)}{1+(r-2)\lambda}, \tag{53}$$

clearly indicating a continuous transition, in the same universality class as regular network percolation: $f$ grows from zero at the transition as $(\lambda-1)$ with exponent 1. This result is known in the literature [19].
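
The self-consistency relation (50) has no closed-form solution for general $r$, but it is straightforward to solve numerically. The sketch below uses bisection on the non-trivial root and compares it with the near-threshold approximation of Eq. (53); the function names `giant_fraction` and `near_threshold` are illustrative assumptions, and bisection is simply one convenient choice of root finder.

```python
from math import log1p

def giant_fraction(lam, r, tol=1e-12):
    """Non-trivial solution f of Eq. (50) for given lambda and rank r.
    Returns 0.0 at or below the threshold lambda = 1."""
    if lam <= 1.0:
        return 0.0

    def g(f):
        # g(f) = ln(1 - f) + lambda / (r - 1) * (1 - (1 - f)^(r - 1))
        return log1p(-f) + lam / (r - 1) * (1.0 - (1.0 - f) ** (r - 1))

    lo, hi = tol, 1.0 - tol      # g > 0 just above 0 and g < 0 near 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def near_threshold(lam, r):
    """Second-order approximation of Eq. (53), valid for lambda close to 1."""
    return 2.0 * (lam - 1.0) / (1.0 + (r - 2) * lam)

if __name__ == "__main__":
    for lam in (1.01, 1.1, 1.5, 3.0):
        print(lam, giant_fraction(lam, 3), near_threshold(lam, 3))
```

Because the left-hand side minus the right-hand side of Eq. (50) is concave in $f$ and vanishes at $f=0$, there is a single positive root for $\lambda>1$, which is what the bisection brackets.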
The previous results focus on hypergraphs, but their relevance to projected networks is not explicitly clear. To clarify this, it is sufficient to explore the properties of $\phi_{ij}(o_{ij},p)$. For this, it is useful to have in mind the asymptotic relations $\binom{N-2}{r-2}\approx N^{r-2}/(r-2)!$ and $\binom{N-1}{r-1}\approx N^{r-1}/(r-1)!$. Inserting $p=\lambda p_c$ in Eq. (21), and taking the limit $N\to\infty$, one obtains the relation $\phi_{ij,{\rm sparse}}(o_{ij},\lambda p_c)=\phi_{ij}^{(s)}(o_{ij},\lambda)=(\lambda/N)^{o_{ij}}e^{-\lambda/N}/o_{ij}!$, which is a Poisson distribution with average $\lambda/N$ (Fig. 4(b)). Therefore, as $N$ increases, the weights on the edges vanish, signalling the fact that in this dilute regime, the hypergraph and projected networks are virtually the same, and hyperedges are non-overlapping asymptotically. Thus, one only needs to calculate the hypergraph percolation properties to be able to write down the projected network percolation properties.

In this sparse regime, the other distributions discussed above have particular forms: for the hypergraph, the distribution of the number of hyperedges $\ell_i$ visiting a node becomes $(\lambda/(r-1))^{\ell_i}e^{-\lambda/(r-1)}/\ell_i!$ (Poisson with average $\lambda/(r-1)$), and the strength distribution on projected networks with $\mathcal{P}_\sigma$ becomes $\xi_i^{(s)}(s_i,\lambda)=(\lambda/(r-1))^{s_i/(r-1)}e^{-\lambda/(r-1)}/[(r-1)(s_i/(r-1))!]$ (Fig. 4(c)). From these results, the meaning of $\lambda$ emerges as the parameter that measures the average node strength of the projected network. Finally, the degree distribution can be calculated if one keeps in mind that in the sparse limit, the probability that hyperedges overlap is minimal, and therefore, one expects that only the minimum number of hyperedges $\ell_i\to\lceil k_i/(r-1)\rceil$ contributes to the distribution. There are subtleties present in explicitly calculating $Q_{r-1}(k_i,\lceil k_i/(r-1)\rceil)$ and $\psi_i^{(s)}(k_i,\lambda)$ when $k_i$ is not a multiple of $r-1$ because hyperedges are forced to overlap in this case, and thus to avoid further details, I only write the unevaluated result $\psi_i^{(s)}(k_i,\lambda)=Q_{r-1}(k_i,\lceil k_i/(r-1)\rceil)(\lambda p_c)^{\lceil k_i/(r-1)\rceil}(1-\lambda p_c)^{\binom{N-1}{r-1}-\lceil k_i/(r-1)\rceil}$ (Fig. 4(d)). However, the calculations are not prohibitive, and will be derived in detail in a forthcoming publication.
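
The quality of the Poisson approximation for edge weights can be inspected directly. The sketch below assumes that the homogeneous overlap distribution of Eq. (21) is binomial, with $\binom{N-2}{r-2}$ independent hyperedges each present with probability $p$ (an assumption about an equation not reproduced in this section), and uses the large-$N$ threshold $p_c\approx(r-2)!/N^{r-1}$. The function names are illustrative.

```python
from math import comb, exp, factorial

def phi_binomial(o, p, N, r):
    """Overlap distribution on a projected link, ASSUMED binomial over the
    binom(N - 2, r - 2) hyperedges that can visit both i and j."""
    M = comb(N - 2, r - 2)
    return comb(M, o) * p ** o * (1 - p) ** (M - o)

def phi_sparse(o, lam, N):
    """Sparse-regime form: Poisson distribution with average lambda / N."""
    mu = lam / N
    return mu ** o * exp(-mu) / factorial(o)

def p_c_asymptotic(N, r):
    """Large-N percolation threshold, (r - 2)! / N^(r - 1)."""
    return factorial(r - 2) / N ** (r - 1)

if __name__ == "__main__":
    N, r, lam = 200, 3, 3.0
    p = lam * p_c_asymptotic(N, r)
    for o in range(4):
        print(o, phi_binomial(o, p, N, r), phi_sparse(o, lam, N))
```

The two columns approach each other as $N$ grows, and the probability of any weight larger than 1 becomes negligible, which is the statement that hyperedges stop overlapping in this regime.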

The sparse regime close to percolation is not the only possible sparse regime. To be concrete, note that for $p$ close to $p_c$, the average node strength is constant, but the average overlap on projected edges vanishes as $1/N$, so the larger the network, the less interaction is present along the edges. However, one can consider a regime in which $\langle o_{ij}\rangle\sim\lambda/N$ is held constant, and in this regime node strength increases with $N$. Both of these regimes are "sparse" in the sense that $p$ vanishes asymptotically, but each regime has specific properties. Generally, these sparse regimes can be defined based on any sensible property, and lead to interesting behaviors. Finally, for the dense regime ($p$ constant), the interesting effect of growth of $\langle o_{ij}\rangle$ vs. $N$ emerges, which is a unique feature of this model.
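
The regimes above can be explored by direct simulation. The following is a minimal sketch, assuming a homogeneous ensemble in which each of the $\binom{N}{r}$ possible hyperedges is present independently with probability $p$, and a projection in which the weight of a link counts the hyperedges shared by its two nodes (as described in the caption of Fig. 1). The function names are hypothetical, and the expression $p\binom{N-2}{r-2}$ for the mean overlap follows from that independence assumption.

```python
import random
from itertools import combinations
from math import comb

def sample_hypergraph(N, r, p, rng):
    """Keep each possible hyperedge of rank r independently with probability p."""
    return [e for e in combinations(range(N), r) if rng.random() < p]

def project(hyperedges):
    """Weighted projected network: weight of (i, j) = number of hyperedges
    containing both i and j. Keys are pairs with i < j."""
    weights = {}
    for e in hyperedges:
        for i, j in combinations(e, 2):
            weights[(i, j)] = weights.get((i, j), 0) + 1
    return weights

def mean_overlap(p, N, r):
    """Expected weight on a link under the independence assumption."""
    return p * comb(N - 2, r - 2)

if __name__ == "__main__":
    rng = random.Random(0)
    r, p = 3, 0.05
    for N in (16, 32, 64):
        w = project(sample_hypergraph(N, r, p, rng))
        observed = sum(w.values()) / comb(N, 2)   # average over all node pairs
        print(N, observed, mean_overlap(p, N, r))
```

At fixed $p$ the printed averages grow with $N$, which is the dense-regime signature discussed in the text; replacing `p` by a value proportional to the threshold would reproduce the sparse behavior instead.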

In conclusion, in this article I present a model of hypergraphs and associated projected 
weighted networks that offers a concise and intuitive picture of hypergraphs, networks, and 
weights. By using statistical mechanics concepts, together with combinatorial tools, I have 
been able to determine some basic features of homogeneous and heterogeneous projected 
networks that offer concrete tests to determine whether a network that has been empirically 
measured may bear the signature of multiway (group) interactions. The general idea of 
using the projection of a hypergraph onto a network has not been well studied, and deserves a close look to determine further properties that can help give a better understanding of the genuine limits and virtues of pairwise network analysis.

The author thanks L. Roberts, A. Gerig, F. Reed-Tsochas, and O. Riordan, for helpful 
discussions, and TSB/EPSRC grant SATURN (TS/H001832/1) and ICT eCollective EU 
Set notation | Explanation | Type of element | Size
$T_{ij}(N,r)$ | Hyperedges of complete hypergraph simultaneously visiting $i$ and $j$ | hyperedge | $\binom{N-2}{r-2}$
$O_{ij}(\sigma)$ | Hyperedges of configuration $\sigma$ simultaneously visiting $i$ and $j$ | hyperedge | $o_{ij}$
$O_{ij}$ | Collection of all possible sets $O_{ij}(\sigma)$ | Set of cardinality $o_{ij}$ of hyperedges |

TABLE I: Notation used for calculation of $\phi_{ij}(o_{ij},p)$. The complement sets $\bar{O}_{ij}(\sigma)$ are with respect to $T_{ij}(N,r)$.



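Set notation | Explanation | Type of element | Size
$T_i(N,r)$ | Hyperedges of complete hypergraph visiting $i$ | hyperedge | $\binom{N-1}{r-1}$
$A_i(\sigma)$ | Hyperedges of configuration $\sigma$ visiting $i$ | hyperedge | $\ell_i$
$A_i$ | Collection of all possible sets $A_i(\sigma)$ | Set of cardinality $\ell_i$ of hyperedges |

TABLE II: Notation used for calculations of the distribution of the number of hyperedges $\ell_i$ visiting node $i$ and of the strength distribution $\xi_i(s_i,p)$ in the $\mathcal{P}_\sigma$ projection. The complement sets $\bar{A}_i(\sigma)$ are with respect to $T_i(N,r)$.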
Set notation | Explanation | Type of element | Size
$K_i^{(\ell_i)}(\sigma)$ | Hyperedges in configuration $\sigma$ visiting $i$ plus $k_i$ other nodes | hyperedge | $\ell_i$
$I_i^{(\rho(k_i),\ell_i)}(\sigma)$ | Hyperedges in configuration $\sigma$ visiting $i$ plus the $k_i$ nodes in set $\rho(k_i)$ | hyperedge | $\ell_i$
$\rho(k_i)$ | Choice of $k_i$ nodes (plus $i$) in $\sigma$ connected to $i$ via $\ell_i$ hyperedges | node |
$K_i(\ell_i)$ | Collection of all possible sets $K_i^{(\ell_i)}(\sigma)$ | Set of cardinality $\ell_i$ of hyperedges | $\binom{N-1}{k_i}Q_{r-1}(k_i,\ell_i)$
$I_i(\rho(k_i),\ell_i)$ | Collection of all possible sets $I_i^{(\rho(k_i),\ell_i)}(\sigma)$ | Set of cardinality $\ell_i$ of hyperedges | $Q_{r-1}(k_i,\ell_i)$
$R_i(N,k_i)$ | Collection of all possible sets $\rho(k_i)$ | Set of cardinality $k_i$ of nodes | $\binom{N-1}{k_i}$

TABLE III: Notation used for calculation of $\psi_i(k_i,p)$. The complement sets $\bar{K}_i^{(\ell_i)}(\sigma)$ and $\bar{I}_i^{(\rho(k_i),\ell_i)}(\sigma)$ are with respect to $T_i(N,r)$.
[Figure 1 panels: "Hypergraph" (left) and "Projected network" (right).]

FIG. 1: Illustration for the projection $\mathcal{P}_\sigma$ from hypergraphs to networks. On the left, a hypergraph is composed of a multitude of hyperedges that exist when $\sigma=1$, and do not when $\sigma=0$. The projected network (right) has a link between all nodes that belong to the same hyperedge, and the weight of the link is the number of hyperedges that share the same pair of nodes.
FIG. 2: Comparison between theoretical distributions (lines) and simulations (symbols) for distributions of homogeneous projected networks for $N=32$ and $r=3$: (a) $\phi_{ij}(o_{ij},p)$ from Eq. (21) for $N=32$ and corresponding simulations (○ for $p=0.2$ and □ for $p=0.4$); (b) $\xi_i(s_i,p)$ from Eq. (27) for $N=32$ and corresponding simulations (○ for $p=0.02$ and □ for $p=0.05$); (c) $\psi_i(k_i,p)$ from Eq. (36) for $N=32$ and corresponding simulations (○ for $p=0.02$ and □ for $p=0.05$). (d) Average degree $\langle k_i\rangle$ as a function of $p$ in homogeneous networks from Eq. (38) and from simulations (○ for $N=32$ and □ for $N=64$).
[Figure 3 panel. Sets: $\rho(k_i=3)=\{a,b,c,i\}$; $I_i(\rho(k_i=3),\ell_i=2)=\{\{(a,b,i),(b,c,i)\},\{(a,b,i),(a,c,i)\},\{(a,c,i),(b,c,i)\}\}$; $I_i(\rho(k_i=3),\ell_i=3)=\{\{(a,b,i),(a,c,i),(b,c,i)\}\}$. Configurations shown include $I_i^{(\rho(k_i=3),\ell_i=2)}(\sigma_1)=\{(a,b,i),(b,c,i)\}$ and a second $\ell_i=2$ configuration $\{(a,c,i),(b,c,i)\}$.]

FIG. 3: Illustration ($r=3$) of the emergence of degree $k_i$ as a consequence of various possible hyperedge configurations. The figure also illustrates $Q_{r-1}(k_i,\ell_i)$. There are 4 possible ways in which $i$ can be connected to nodes $\{a,b,c\}$, each case corresponding to one of the configurations shown above ($\sigma_1,\sigma_2,\sigma_3,\sigma_4$) in the projected network. The sets $I_i(\rho(k_i),\ell_i)$ are defined for both $\ell_i=2$ and 3, the only two possible cases. Note also that if one focuses only on the nodes $\{a,b,c\}$, ignoring $i$, all configurations can be mapped to the construction of all possible cliques of size 2 of these nodes, generating $Q_{r-1=2}(k_i=3,\ell_i=2)=3$ and $Q_{r-1=2}(k_i=3,\ell_i=3)=1$. The fact that all configurations are globally connected is an accident due to the small value of $k_i=3$; in general each node simply needs to belong to at least one of the $\ell_i$ cliques of size $r-1$. Finally, note the thickness of links, representative of $o_{ij}$.
FIG. 4: (color online) Percolation limit for the ensemble of $\langle s\rangle=\lambda_{\max}$: (a) $f(z)/f(z=1)$ vs. $z$ ($\lambda_{\max}=3.0$) from Eq. (50) (line) and simulations for $N=64$, $N=128$, $N=256$ and $N=512$ (symbols). As the system size increases, the theoretical solution is approached. Projected network properties ($\mathcal{P}_\sigma$ for $s_i$) for $N=128$ in the ensemble of $\langle s\rangle=\lambda=4.0$ predicted by theory (line), their respective sparse approximations (dashed line) and simulations (symbols): (b) $\phi_{ij}^{(s)}(o_{ij},\lambda)$, (c) $\xi_i^{(s)}(s_i,\lambda)$, and (d) $\psi_i^{(s)}(k_i,\lambda)$.